INTRODUCTION


Consider the binomial distribution b(n, p), where the parameter n
represents the number of trials and the parameter p represents the
probability of success. When n is fixed in advance and k successes are
observed, the usual problem is to estimate the probability of success p in
the experiment. Situations may also arise in which n is the unknown
parameter of interest. If p is assumed to be known and k successes have
been observed, the experimenter would be interested in estimating n
instead. One example is the estimation of the total number of herds in a
small area of Kruger Park, South Africa, from the numbers of herds observed
on several occasions; this example is discussed in detail by Carroll and
Lombard [1]. This kind of problem bears directly on some naval operations,
more specifically, search problems. When some threats have been detected in
a certain region, the total number of threats in that region becomes a
major concern. An estimate of the unknown quantity would be an important
consideration in the decision-making process.

The next section presents the background and recent development of
the problem of estimating n in a binomial distribution. The third section
derives the procedure for estimating n in the form of a confidence interval.
The last section consists of some concluding remarks. A simulation
procedure, an interactive computer program, and selected tables are
included in the appendixes.


BACKGROUND 


The usual problem to which a binomial distribution is applied is
the estimation of p given k successes among n trials. Although the problem
of estimating n has a long history, it was not investigated intensively
until recently. Olkin, Petkau, and Zidek [2] noted that both the method of
moments estimator and the maximum likelihood estimator of n are "highly
unstable." They proposed some stabilized versions of these estimators.
Blumenthal and Dahiya [3] offered an alternative stabilized maximum
likelihood estimator by modifying the likelihood function. Carroll and
Lombard [1] examined these estimators by applying a beta prior distribution
to the probability of success p. They claimed that their estimators
compared favorably with those introduced by Olkin et al. [2]. Most
recently, Casella [4] proposed to assess the stability of the estimation
problem and to choose an appropriate point estimator based on the
assessment. Each of the methods reported above requires complicated
iterations based on repeated counts of the number of successes. In most of
our applications, it is not feasible to recreate the same situation and
repeat the experiment. Therefore, the procedure proposed in this study is
derived under the assumption that only one sample is available. Unlike the
works cited above, which provide point estimators of the parameter n, the
computationally simple procedure developed here provides a confidence
interval for the parameter.

This procedure is particularly useful for users who need a quick
and easy estimate of n but have only limited computer capability.
Confidence intervals are approximated by applying the central limit
theorem. Because the normal approximation to the binomial distribution is
involved, the confidence levels may not be attained exactly. However, a
simulation showed that, on average, the intervals do achieve the confidence
levels specified by the confidence coefficients. The procedure is detailed
in the next section.


PROCEDURE 


Let $Y$ be a random variable having a binomial distribution, $b(n, p)$.

Then,

\[ E(Y) = np , \]

and

\[ \mathrm{Var}(Y) = np(1 - p) . \]


By the central limit theorem, the fraction

\[ \frac{Y - np}{\sqrt{np(1 - p)}} \]

converges in distribution to $Z$, where $Z$ is a random variable having a
standard normal distribution, $N(0, 1)$. Therefore,

\[ \mathrm{Prob}\left[ -z_{\alpha/2} < \frac{Y - np}{\sqrt{np(1 - p)}} < z_{\alpha/2} \right] \approx 1 - \alpha , \]


where $z_{\alpha/2}$ is the upper $100\alpha/2$-percentage value from a
standard normal distribution table (e.g., $z_{0.05} = 1.645$). For a given
confidence coefficient $1 - \alpha$, a confidence interval for $n$ can be
obtained by solving the inequality

\[ -z_{\alpha/2} < \frac{Y - np}{\sqrt{np(1 - p)}} < z_{\alpha/2} . \]

The double inequality can be reduced to the following quadratic
inequality:

\[ (Y - np)^2 < z_{\alpha/2}^2 \, np(1 - p) . \]
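
As an intermediate step that the derivation leaves implicit, expanding the
square and collecting powers of $n$ puts the inequality in standard
quadratic form. The coefficients below agree with the constants A, B, and C
computed in the program of appendix B. Because the leading coefficient
$p^2$ is positive, the inequality holds for values of $n$ strictly between
the two roots.

```latex
\[
(Y - np)^2 < z_{\alpha/2}^2 \, np(1 - p)
\quad\Longleftrightarrow\quad
p^2 n^2 \;-\; p\left[\, 2Y + z_{\alpha/2}^2 (1 - p) \right] n \;+\; Y^2 \;<\; 0 .
\]
```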


When the number of successes has been observed (i.e., $Y = y$), the
above expression is a quadratic inequality in $n$ and can be solved by the
familiar quadratic formula. The two roots give the confidence limits for
$n$ with the given confidence coefficient $1 - \alpha$.
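
The calculation can be sketched in a few lines of Python. This is an
illustration only, not the report's program: the function name
`n_confidence_interval` and its argument names are assumptions made here.
The rounding to integer endpoints follows the truncation rule used in the
appendix B listing.

```python
import math


def n_confidence_interval(y, p, z):
    """Approximate confidence limits for binomial n, given y observed
    successes, known success probability p, and z = z_{alpha/2}."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if y < 0:
        raise ValueError("y must be non-negative")
    # Coefficients of a*n^2 + b*n + c < 0, obtained from
    # (y - n*p)^2 < z^2 * n * p * (1 - p).
    a = p ** 2
    b = -p * (2.0 * y + z ** 2 * (1.0 - p))
    c = float(y) ** 2
    # The discriminant is non-negative in exact arithmetic; max() guards
    # against a tiny negative value caused by rounding.
    root = math.sqrt(max(b ** 2 - 4.0 * a * c, 0.0))
    n_low = (-b - root) / (2.0 * a)
    n_high = (-b + root) / (2.0 * a)
    # Integer endpoints: smallest integer above the lower root and
    # largest integer not exceeding the upper root.
    return math.floor(n_low) + 1, math.floor(n_high)


if __name__ == "__main__":
    # One success, p = 0.5, 95-percent level (z = 1.960)
    print(n_confidence_interval(1, 0.5, 1.960))  # -> (1, 7)
```

For one success with p = 0.5 at the 95-percent level, the sketch returns
the limits 1 and 7, which match the corresponding entry of table C-2.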


REMARKS 

The procedure described in the preceding section provides an easy,
quick way to approximate a confidence interval for n at a given confidence
level when p is known. In most naval operations, timing can be crucial.
Scenarios may change in a short period of time, and it is therefore
necessary to produce estimates quickly. The interactive computer program
helps users obtain an interval estimate of the parameter n in a binomial
distribution. The program can be adapted to almost any computer.
Confidence intervals with confidence coefficients of 90 percent and
95 percent are given in tables C-1 and C-2, respectively. One of the merits
of the procedure is its simplicity in computation. When computers and
tables are not available, the problem can still be solved by using a pocket
calculator.

Because the derivation of the confidence intervals involves the normal
approximation to a binomial distribution, the nominal confidence level
might not be attained. No such shortfall was observed, however, in a
simulation study in which 5,000 replications were simulated at each
combination of probability of success and confidence coefficient in
tables C-1 and C-2. The results showed that the percentage of trials in
which the confidence limits calculated by the methodology reported here
included the known value of the parameter n was approximately equal to the
specified confidence level.

To apply this procedure, the probability of success is assumed to be
known. This assumption may not be fulfilled in practice. Yet prior
information and subjective assessment could be used to determine a
reasonable and acceptable probability of success.


APPENDIX A


In this simulation, it is assumed that the confidence coefficient,
$1 - \alpha$, and the probability of success $p$ are specified. For the
given $p$, let $J$ be a large positive integer. $J$ may be chosen to be the
right endpoint of the confidence interval for the largest number of
successes under consideration. For example, when $1 - \alpha$ = 90 percent
and $p = 0.1$, $J$ may be chosen to be 341. (See table C-1.)


After $J$ has been determined, select a random integer $n_i$ between 1 and
$J$, for $i = 1, 2, \ldots, B$, where $B$ is the number of replicates
intended for this simulation. Then select $n_i$ random numbers between 0
and 1, say

\[ p_i^{(1)}, p_i^{(2)}, \ldots, p_i^{(n_i)} . \]

Define

\[ a_i^{(s)} = \begin{cases} 1 & \text{if } p_i^{(s)} < p \\ 0 & \text{otherwise.} \end{cases} \]

Let

\[ k_i = \sum_{s=1}^{n_i} a_i^{(s)} . \]

Then $k_i$ represents the number of successes in the $i$th simulation.
Corresponding to this $k_i$, there is a confidence interval $[l_i, u_i]$.
This confidence interval can be either computed by using the interactive
computer program in appendix B or obtained from the tables included in
appendix C. Define

\[ b_i = \begin{cases} 1 & \text{if } l_i \le n_i \le u_i \\ 0 & \text{otherwise.} \end{cases} \]

Repeat the process $B$ times in a similar manner. Then

\[ 1 - \hat{\alpha} = \frac{1}{B} \sum_{i=1}^{B} b_i \]

gives the fraction of the intervals that include the true $n_i$ for the
given $p$. If $1 - \hat{\alpha}$ is close to $1 - \alpha$, the proposed
method would be satisfactory for approximating confidence intervals for $n$.


In this study, with $1 - \alpha$ = 90 percent and 95 percent and
$B$ = 5,000 replications, the results show that $1 - \hat{\alpha}$ is
approximately equal to 90 percent and 95 percent, respectively. Simulations
for other values of $1 - \alpha$ show similar results. This indicates that
the proposed method provides a quick and easy way to construct confidence
intervals for the parameter $n$ in a binomial distribution.
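
The simulation steps described in this appendix can be sketched as
follows. This is a minimal illustration under stated assumptions, not the
code used for the study: the function names, the use of Python's `random`
module, and the seed are all choices made here. The interval is computed
directly from the quadratic inequality instead of being read from the
tables.

```python
import math
import random


def n_confidence_interval(k, p, z):
    """Integer confidence limits for n from the quadratic inequality
    (k - n*p)^2 < z^2 * n * p * (1 - p)."""
    a = p ** 2
    b = -p * (2.0 * k + z ** 2 * (1.0 - p))
    c = float(k) ** 2
    root = math.sqrt(max(b ** 2 - 4.0 * a * c, 0.0))
    low = math.floor((-b - root) / (2.0 * a)) + 1
    high = math.floor((-b + root) / (2.0 * a))
    return low, high


def estimate_coverage(p, z, J, B, seed=0):
    """Fraction of B simulated intervals that contain the true n_i."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    hits = 0
    for _ in range(B):
        n_i = rng.randint(1, J)  # random integer between 1 and J inclusive
        # k_i counts how many of the n_i uniform draws fall below p
        k_i = sum(1 for _ in range(n_i) if rng.random() < p)
        low, high = n_confidence_interval(k_i, p, z)
        if low <= n_i <= high:  # this replicate has b_i = 1
            hits += 1
    return hits / B


if __name__ == "__main__":
    # p = 0.3 at the 95-percent level (z = 1.960). J = 115 is the upper
    # limit listed in table C-2 for 25 successes and p = 0.3.
    print(estimate_coverage(p=0.3, z=1.960, J=115, B=5000))
```

The printed fraction plays the role of $1 - \hat{\alpha}$. The exact value
depends on the seed and on the random number generator, so it is compared
with the nominal level only approximately.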


APPENDIX B


AN INTERACTIVE COMPUTER PROGRAM


C     CONFIDENCE INTERVAL FOR PARAMETER n IN A BINOMIAL DISTRIBUTION
C
C     THE PURPOSE OF THIS CODE IS TO ACCEPT INPUT PARAMETERS
C     (CONFIDENCE LEVEL AND PROBABILITY OF SUCCESS)
C     AND INVOKE A ROUTINE TO COMPUTE CONFIDENCE INTERVALS
C     USING THE CENTRAL LIMIT THEOREM METHODOLOGY.  A REPORT
C     DEPICTING CONFIDENCE INTERVALS IS PRODUCED AND WRITTEN
C     TO THE CRT.
C
C     PROGRAM INPUT   1)  CONLEV
C                     2)  PRSUCC
C
C     PROGRAM OUTPUT  1)  CONFIDENCE INTERVALS
C
C     VARIABLE DEFINITIONS
C
C     PRSUCC    - PROBABILITY OF SUCCESS
C     CONLEV    - CONFIDENCE LEVEL
C     VT_MATRIX - COMPUTED TABLE OF LOWER AND UPPER LIMITS OF
C                 CONFIDENCE INTERVALS FOR K = 1 THROUGH 25
C                 SUCCESSES, AND FOR SELECTED CONFIDENCE LEVEL
C                 AND PROBABILITY OF SUCCESS
C     K         - NUMBER OF SUCCESSES
C
C     DATA DECLARATION/INITIALIZATION
C
      INTEGER VT_MATRIX(25,2), CONLEV
      REAL K
C
C     ACCEPT CONFIDENCE LEVEL FROM INPUT DEVICE, EDIT VALUE,
C     AND ASSIGN APPROPRIATE Z VALUE
C
   25 PRINT *, ' '
      PRINT *, 'ENTER CONFIDENCE LEVEL'
      PRINT *, ' '
      PRINT *, 'CONFIDENCE LEVEL MUST BE 60, 70, 80, 90, 95, 98, OR 99'
      PRINT *, ' '
      ACCEPT 30, FLOAT_CONLEV
   30 FORMAT (F80.0)
      CONLEV = FLOAT_CONLEV
      IF (CONLEV .EQ. 99) THEN
        ZCONST = 2.576**2.0
      ELSEIF (CONLEV .EQ. 98) THEN
        ZCONST = 2.326**2.0
      ELSEIF (CONLEV .EQ. 95) THEN
        ZCONST = 1.960**2.0
      ELSEIF (CONLEV .EQ. 90) THEN
        ZCONST = 1.645**2.0
      ELSEIF (CONLEV .EQ. 80) THEN
        ZCONST = 1.282**2.0
      ELSEIF (CONLEV .EQ. 70) THEN
        ZCONST = 1.040**2.0
      ELSEIF (CONLEV .EQ. 60) THEN
        ZCONST = .8400**2.0
      ELSE
        GOTO 25
      ENDIF
C
C     ACCEPT PROBABILITY OF DETECTION FROM INPUT DEVICE AND EDIT VALUE
C
   35 PRINT *, ' '
      PRINT *, 'ENTER PROBABILITY OF DETECTION',
     +         ' (MUST BE BETWEEN .01 AND .50)'
      PRINT *, ' '
      ACCEPT 40, PRSUCC
   40 FORMAT (F80.2)
      IF (PRSUCC .GE. .01 .AND. PRSUCC .LE. .50) THEN
        GOTO 45
      ELSE
        PRINT *, ' '
        PRINT *, 'PROBABILITY OF DETECTION MUST BE BETWEEN .01 AND .50'
        GOTO 35
      ENDIF
C
C     INVOKE ROUTINE TO COMPUTE CONFIDENCE INTERVALS
C
C     COMPUTE FUNCTION CONSTANTS
C
   45 A = PRSUCC**2.0
      DO 48 I = 1, 25
        K = I
        B = -PRSUCC * ((2.0*K) + (ZCONST*(1.0-PRSUCC)))
        C = K**2.0
C
C       INVOKE QUADRATIC FORMULA USING FUNCTION CONSTANTS
C
        DISCRI = (B**2.0) - (4.0*A*C)
        SQRTDI = SQRT(DISCRI)
        XL = (-B - SQRTDI) / (2.0*A)
        XU = (-B + SQRTDI) / (2.0*A)
C
C       STORE LOWER AND UPPER ENDPOINTS (TRUNCATED TO INTEGERS)
C
        IXL = XL
        IXU = XU
        VT_MATRIX(I,1) = IXL + 1
        VT_MATRIX(I,2) = IXU
   48 CONTINUE
C
C     GENERATE REPORT
C
      WRITE (6,50)
   50 FORMAT (1H1,25X,'CONFIDENCE INTERVALS')
      WRITE (6,55)
   55 FORMAT (1H0,15X,'"NUMBER OF OBJECTS DETECTED IN A REGION"')
      WRITE (6,60) CONLEV
   60 FORMAT (1H0,1X,'CONFIDENCE LEVEL = ',I3,'%')
      WRITE (6,65) PRSUCC
   65 FORMAT (1H ,1X,'PROBABILITY OF DETECTION = ',F3.2)
      WRITE (6,70)
   70 FORMAT (1H0,15X,'NO OF SUCCESSES',11X,'LOWER',8X,
     +        'UPPER')
      WRITE (6,75)
   75 FORMAT (1H ,15X,' OBSERVED (K)',13X,'LIMIT',8X,
     +        'LIMIT')
      WRITE (6,80)
   80 FORMAT (1H ,15X,' ------------',13X,'-----',8X,
     +        '-----')
      DO 110 I = 1, 25
        WRITE (6,100) I, VT_MATRIX(I,1), VT_MATRIX(I,2)
  100   FORMAT (1H ,21X,I2,18X,I4,9X,I4)
  110 CONTINUE
      STOP
      END
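
The program accepts only seven confidence levels because each z value is
entered as a constant. The sketch below shows how the same quantity could
be computed for any level. It is written in Python, not the program's
FORTRAN, and assumes the standard library class `statistics.NormalDist`
(Python 3.8 or later); `z_squared` is a hypothetical helper name, not part
of the original program.

```python
from statistics import NormalDist


def z_squared(confidence_percent):
    """Squared two-sided standard normal critical value for a confidence
    level given in percent (for example, 95 gives z_{0.025} squared)."""
    if not 0.0 < confidence_percent < 100.0:
        raise ValueError("confidence level must lie strictly between 0 and 100")
    alpha = 1.0 - confidence_percent / 100.0
    # Upper alpha/2 point of the standard normal distribution N(0, 1)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z ** 2


if __name__ == "__main__":
    # Print z for the seven levels accepted by the program
    for level in (60, 70, 80, 90, 95, 98, 99):
        print(level, round(z_squared(level) ** 0.5, 3))
```

The computed values agree with the program's constants to about three
decimal places, except at the 70-percent level, where the quantile is close
to 1.036 and the program uses 1.040.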


TABLE C-1


90-PERCENT CONFIDENCE INTERVAL FOR n


[Table entries are not legible in this copy. Rows: number of successes.
Columns: probability of success.]


TABLE C-2


95-PERCENT CONFIDENCE INTERVAL FOR n
(each entry gives the lower limit, upper limit)


Number of                        Probability of success
successes         .01          .1          .2          .3          .4          .5

     1        18, 562       2, 52       2, 24       1, 14       1, 10        1, 7
     2        56, 725       6, 68       4, 32       3, 20       2, 13       2, 10
     3       103, 877      11, 83       6, 39       5, 24       4, 17       3, 13
     4      157, 1024      17, 98       9, 46       6, 29       5, 21       5, 15
     5      215, 1165     23, 112      12, 53       9, 34       7, 24       6, 18
     6      277, 1304     29, 126      15, 60      11, 38       9, 27       7, 20
     7      341, 1440     36, 139      19, 67      13, 42      10, 30       9, 23
     8      407, 1573     42, 152      22, 73      16, 47      12, 33      10, 25
     9      475, 1705     49, 165      26, 80      18, 51      14, 37      12, 28
    10      545, 1835     56, 178      29, 86      20, 55      16, 40      13, 30
    11      616, 1964     64, 191      33, 92      23, 59      18, 43      15, 33
    12      689, 2091     71, 203      37, 99      26, 63      20, 46      17, 35
    13      762, 2218     79, 216     41, 105      28, 68      22, 49      18, 38
    14      837, 2344     86, 228     44, 111      31, 72      24, 52      20, 40
    15      912, 2469     94, 241     48, 117      33, 76      26, 55      22, 42
    16      988, 2593    101, 253     52, 123      36, 80      28, 58      23, 45
    17     1064, 2716    109, 265     56, 129      39, 84      30, 61      25, 47
    18     1142, 2839    117, 278     60, 135      41, 88      32, 64      27, 49
    19     1220, 2961    125, 290     64, 141      44, 92      34, 67      28, 52
    20     1298, 3082    133, 302     68, 147      47, 96      36, 70      30, 54
    21     1377, 3203    141, 314     72, 153      50, 99      38, 73      32, 56
    22     1456, 3324    149, 326     76, 159     52, 103      40, 75      33, 59
    23     1536, 3444    157, 338     80, 165     55, 107      42, 78      35, 61
    24     1617, 3564    165, 350     85, 171     58, 111      45, 81      37, 63
    25     1697, 3683    173, 361     89, 177     61, 115      47, 84      38, 65


